Words containing ,:

Rank in Wordlist | Frequency | Word |
---|---|---|
2611 | 36 | 1,5 |
7759 | 10 | 2,5 |
9169 | 8 | 3,5 |
10105 | 7 | 5,4 |
11722 | 6 | Pirritx, Porrotx eta Marimotots |
12865 | 5 | 2,8 |
12895 | 5 | 4,5 |
14926 | 4 | 1,4 |
15008 | 4 | 278.975,10 |
15012 | 4 | 3,6 |

Words containing %:

Rank in Wordlist | Frequency | Word |
---|---|---|
51807 | 1 | bilketa%15,1 |
65911 | 1 | proiektuaren%65 |
69345 | 1 | —%30 |

Words containing &:

Rank in Wordlist | Frequency | Word |
---|---|---|
45735 | 1 | Pris&Batty |
66081 | 1 | rhythm&blues |
66086 | 1 | rock&rol |

Words containing ':

Rank in Wordlist | Frequency | Word |
---|---|---|
18651 | 3 | D'Elikatuz |
34436 | 1 | 20'5 |
38355 | 1 | D'Anoiarekin |
38356 | 1 | D'elikatuz |
41313 | 1 | Hitza'-ko |
42052 | 1 | Italiano'-k |
43980 | 1 | Marron'-ek |

Words containing +:

Rank in Wordlist | Frequency | Word |
---|---|---|
25395 | 2 | I+G |
34588 | 1 | 23+1 |
41447 | 1 | I+G+B-an |
43099 | 1 | LGTBIQ+aren |
44025 | 1 | Master+45 |

Words containing /:

Rank in Wordlist | Frequency | Word |
---|---|---|
11287 | 6 | 2019/2020 |
11288 | 6 | 2020/2021 |
12886 | 5 | 24/7 |
14985 | 4 | 2020/21 |
16804 | 4 | eta/edo |
25984 | 2 | Legazpi/Urretxu |
28937 | 2 | edukiontzi/kubo |
30421 | 2 | https://labur |
33560 | 1 | 0070/LI/2020 |
33990 | 1 | 14/24 |
In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words, while others are not. If words containing forbidden characters do not have a very low frequency, there may be a problem in preprocessing.
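
The check described here can be sketched in a few lines of Python. This is only an illustration: the function name `flag_suspicious`, the `allowed` set and the `min_freq` threshold of 5 are assumptions, since the report does not state which characters a language permits or what counts as "very low" frequency.

```python
# Special characters examined in this subsection.
SPECIAL_CHARS = set(",()%&$\"'+*=/_")


def flag_suspicious(wordlist, allowed=frozenset(), min_freq=5):
    """Return (word, freq, offending_chars) for words worth a manual look.

    wordlist: iterable of (word, freq) pairs.
    allowed:  characters considered legitimate inside words (assumption,
              language dependent).
    min_freq: hypothetical cut-off; words below it are treated as noise.
    """
    forbidden = SPECIAL_CHARS - set(allowed)
    flagged = []
    for word, freq in wordlist:
        hits = sorted(set(word) & forbidden)
        if hits and freq >= min_freq:
            flagged.append((word, freq, "".join(hits)))
    return flagged


if __name__ == "__main__":
    # A few (word, freq) pairs of the kind listed in this subsection,
    # plus one ordinary word without special characters.
    sample = [
        ("1,5", 36),
        ("24/7", 5),
        ("eta/edo", 4),
        ("rock&rol", 1),
        ("D'Elikatuz", 3),
        ("etxea", 120),
    ]
    for word, freq, chars in flag_suspicious(sample, allowed={"'"}):
        print(f"{word}\t{freq}\tcontains {chars}")
```

With a low threshold, decimal numbers such as 1,5 will be flagged. Whether that points to a preprocessing problem or to a legitimate number format is a judgment that depends on the language.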
Words containing %:
select w_id-100,freq, word from words where w_id>100 and word like "%\%%" limit 10;
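
The same query can be repeated for each special character. The sketch below does this with Python's standard `sqlite3` module against a small in-memory table. The `words(w_id, freq, word)` layout is taken from the query above; the use of SQLite, the helper names and the sample rows are assumptions made for illustration only. Note that `%` and `_` are `LIKE` wildcards and must be escaped, and that SQLite needs an explicit `ESCAPE` clause for the backslash to act as an escape character.

```python
import sqlite3

# Special characters examined in this subsection.
SPECIAL_CHARS = [",", "(", ")", "%", "&", "$", '"', "'", "+", "*", "=", "/", "_"]


def like_pattern(ch):
    """Build a LIKE pattern matching any word that contains ch literally."""
    # % and _ are LIKE wildcards; the backslash is our escape character.
    if ch in ("%", "_", "\\"):
        ch = "\\" + ch
    return f"%{ch}%"


def words_containing(conn, ch, limit=10):
    """Return (rank, freq, word) rows for words containing ch, best rank first."""
    sql = (
        "SELECT w_id - 100, freq, word FROM words "
        "WHERE w_id > 100 AND word LIKE ? ESCAPE '\\' "
        "ORDER BY w_id LIMIT ?"
    )
    return conn.execute(sql, (like_pattern(ch), limit)).fetchall()


def demo_connection():
    """Create an in-memory table with a few illustrative rows (assumed data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (w_id INTEGER PRIMARY KEY, freq INTEGER, word TEXT)"
    )
    rows = [
        (12, 5000, ","),            # w_id <= 100: skipped by the query
        (2711, 36, "1,5"),
        (12986, 5, "24/7"),
        (18751, 3, "D'Elikatuz"),
        (25495, 2, "I+G"),
        (45835, 1, "Pris&Batty"),
        (66011, 1, "proiektuaren%65"),
    ]
    conn.executemany("INSERT INTO words VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    for ch in SPECIAL_CHARS:
        for rank, freq, word in words_containing(conn, ch):
            print(f"{ch}\t{rank}\t{freq}\t{word}")
```

Passing the pattern as a bound parameter avoids quoting problems for characters such as `'` and `"`. Against a MySQL database the original query works without the `ESCAPE` clause, because the backslash is the default escape character there.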
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots